Discovering foodborne illness in online restaurant reviews.
Abstract
Objective: We developed a system for the discovery of foodborne illness mentioned in online Yelp restaurant reviews using text classification. The system is used by the New York City Department of Health and Mental Hygiene (DOHMH) to monitor Yelp for foodborne illness complaints.
Materials and Methods: We built classifiers for 2 tasks: (1) determining if a review indicated a person experiencing foodborne illness and (2) determining if a review indicated multiple people experiencing foodborne illness. We first developed a prototype classifier in 2012 for both tasks using a small labeled dataset. Over years of system deployment, DOHMH epidemiologists labeled 13 526 reviews selected by this classifier. We used these biased data and a sample of complementary reviews in a principled bias-adjusted training scheme to develop significantly improved classifiers. Finally, we performed an error analysis of the best resulting classifiers.
Results: We found that logistic regression trained with bias-adjusted augmented data performed best for both classification tasks, with F1-scores of 87% and 66% for tasks 1 and 2, respectively.
Discussion: Our error analysis revealed that the inability of our models to account for long phrases caused the most errors. Our bias-adjusted training scheme illustrates how to improve a classification system iteratively by exploiting available biased labeled data.
Conclusions: Our system has been instrumental in the identification of 10 outbreaks and 8523 complaints of foodborne illness associated with New York City restaurants since July 2012. Our evaluation has identified strong classifiers for both tasks, whose deployment will allow DOHMH epidemiologists to more effectively monitor Yelp for foodborne illness investigations.
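For readers unfamiliar with the metric, the F1-scores reported in the abstract are the harmonic mean of precision and recall for the positive class. The following is a minimal sketch of that computation only; the function name `f1_score` and the label lists are hypothetical and are not taken from the paper or its data.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Return the F1-score of the positive class.

    F1 is the harmonic mean of precision and recall.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: treat F1 as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = review indicates foodborne illness, 0 = it does not.
true_labels = [1, 1, 1, 0, 0, 1, 0, 0]
predicted_labels = [1, 1, 0, 0, 1, 1, 0, 0]
score = f1_score(true_labels, predicted_labels)
print(f"F1 = {score:.2f}")
```

In this toy example there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75 and so is F1.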
Similar resources
Using Online Reviews by Restaurant Patrons to Identify Unreported Cases of Foodborne Illness — New York City, 2012–2013
While investigating an outbreak of gastrointestinal disease associated with a restaurant, the New York City Department of Health and Mental Hygiene (DOHMH) noted that patrons had reported illnesses on the business review website Yelp (http://www.yelp.com) that had not been reported to DOHMH. To explore the potential of using Yelp to identify unreported outbreaks, DOHMH worked with Columbia Univ...
A for Effort? Using the Crowd to Identify Moral Hazard in NYC Restaurant Hygiene Inspections
From an upset stomach to a life-threatening foodborne illness, getting sick is all too common after eating in restaurants. While health inspection programs are designed to protect consumers, such inspections typically occur at wide intervals of time, allowing restaurant hygiene to remain unmonitored in the interim periods. Information provided in online reviews may be effectively used in these ...
Health Department Use of Social Media to Identify Foodborne Illness — Chicago, Illinois, 2013–2014
An estimated 55 million to 105 million persons in the United States experience acute gastroenteritis caused by foodborne illness each year, resulting in costs of $2-$4 billion annually. Many persons do not seek treatment, resulting in underreporting of the actual number of cases and cost of the illnesses. To prevent foodborne illness, local health departments nationwide license and inspect rest...
Using Twitter to Identify and Respond to Food Poisoning: The Food Safety STL Project
CONTEXT Foodborne illness affects 1 in 4 US residents each year. Few of those sickened seek medical care or report the illness to public health authorities, complicating prevention efforts. Citizens who report illness identify food establishments with more serious and critical violations than found by regular inspections. New media sources, including online restaurant reviews and social media p...
Supplementing Public Health Inspection via Social Media.
Foodborne illness is prevented by inspection and surveillance conducted by health departments across America. Appropriate restaurant behavior is enforced and monitored via public health inspections. However, surveillance coverage provided by state and local health departments is insufficient in preventing the rising number of foodborne illness outbreaks. To address this need for improved survei...
Journal: Journal of the American Medical Informatics Association (JAMIA)
Publication date: 2018